Overview


Dataset statistics

Number of variables: 25
Number of observations: 2139
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 417.9 KiB
Average record size in memory: 200.1 B
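
The derived figures in this block can be cross-checked from the raw counts. The sketch below is only a sanity check, assuming that "cells" means variables × observations and that 1 KiB = 1024 bytes; the variable names are illustrative and are not part of the report.

```python
# Values copied from the "Dataset statistics" block.
n_variables = 25
n_observations = 2139
missing_cells = 0
total_size_kib = 417.9  # assumption: 1 KiB = 1024 bytes

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the computed record size agrees with the reported 200.1 B after rounding to one decimal place.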

Variable types

Numeric: 9
Categorical: 16

Alerts

zprior has constant value "1" [Constant]
cd40 is highly overall correlated with cd420 [High correlation]
cd420 is highly overall correlated with cd40 [High correlation]
cd80 is highly overall correlated with cd820 [High correlation]
cd820 is highly overall correlated with cd80 [High correlation]
cid is highly overall correlated with time [High correlation]
gender is highly overall correlated with homo [High correlation]
hemo is highly overall correlated with pidnum [High correlation]
homo is highly overall correlated with gender [High correlation]
pidnum is highly overall correlated with hemo [High correlation]
preanti is highly overall correlated with str2 and 2 other fields [High correlation]
str2 is highly overall correlated with preanti and 2 other fields [High correlation]
strat is highly overall correlated with preanti and 2 other fields [High correlation]
time is highly overall correlated with cid [High correlation]
treat is highly overall correlated with trt [High correlation]
trt is highly overall correlated with treat [High correlation]
z30 is highly overall correlated with preanti and 2 other fields [High correlation]
hemo is highly imbalanced (58.3%) [Imbalance]
oprior is highly imbalanced (84.8%) [Imbalance]
pidnum has unique values [Unique]
preanti has 873 (40.8%) zeros [Zeros]
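
The percentages in the Imbalance alerts are not class shares (hemo's majority class holds 91.6% of rows, not 58.3%). A minimal sketch, assuming the score is one minus the Shannon entropy of the class distribution normalised by the maximum possible entropy; this assumption reproduces both reported figures from the class counts listed further down in the report. The function name `imbalance_score` is chosen here for illustration.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    A single-class column is scored 0 here (a convention of this sketch).
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts taken from the hemo and oprior sections of this report.
hemo_score = imbalance_score([1959, 180])
oprior_score = imbalance_score([2092, 47])

print(f"hemo:   {100 * hemo_score:.1f}%")
print(f"oprior: {100 * oprior_score:.1f}%")
```

Because the score is entropy-based, a 50/50 split gives 0 and the score rises quickly as the minority class shrinks.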

Reproduction

Analysis started: 2024-12-19 11:04:23.704229
Analysis finished: 2024-12-19 11:04:53.489856
Duration: 29.79 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
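
The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-12-19 11:04:23.704229")
finished = datetime.fromisoformat("2024-12-19 11:04:53.489856")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```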

Variables

pidnum
Real number (ℝ)

High correlation  Unique 

Distinct: 2139
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 248778.25
Minimum: 10056
Maximum: 990077
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 10056
5-th percentile: 11445.9
Q1: 81446.5
median: 190566
Q3: 280277
95-th percentile: 920041.1
Maximum: 990077
Range: 980021
Interquartile range (IQR): 198830.5

Descriptive statistics

Standard deviation: 234237.29
Coefficient of variation (CV): 0.94155051
Kurtosis: 2.6146168
Mean: 248778.25
Median Absolute Deviation (MAD): 109114
Skewness: 1.7362921
Sum: 5.3213668 × 10^8
Variance: 5.4867108 × 10^10
Monotonicity: Strictly increasing
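
Several of the pidnum statistics are simple functions of the others, which gives a quick consistency check on the block. The sketch below recomputes them from the reported (rounded) values, so small differences in the last digits are expected; all names are illustrative.

```python
# Reported values for pidnum (rounded as shown in the report).
minimum, maximum = 10056, 990077
q1, q3 = 81446.5, 280277
mean, std = 248778.25, 234237.29
n_observations = 2139

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n_observations     # Sum recovered from the mean

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.4e}")
print(f"Sum: {total:.4e}")
```

The same relationships hold for the other numeric variables in this report, for example Range = Maximum - Minimum and CV = Standard deviation / Mean.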
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
10056                1      < 0.1%
251026               1      < 0.1%
251049               1      < 0.1%
251048               1      < 0.1%
251045               1      < 0.1%
251044               1      < 0.1%
251043               1      < 0.1%
251041               1      < 0.1%
251039               1      < 0.1%
251038               1      < 0.1%
Other values (2129)  2129   99.5%

Minimum 10 values

Value   Count  Frequency (%)
10056   1      < 0.1%
10059   1      < 0.1%
10089   1      < 0.1%
10093   1      < 0.1%
10124   1      < 0.1%
10140   1      < 0.1%
10165   1      < 0.1%
10190   1      < 0.1%
10198   1      < 0.1%
10229   1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
990077  1      < 0.1%
990071  1      < 0.1%
990030  1      < 0.1%
990026  1      < 0.1%
990021  1      < 0.1%
990019  1      < 0.1%
990018  1      < 0.1%
980046  1      < 0.1%
980045  1      < 0.1%
980042  1      < 0.1%
< 0.1%

cid
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1618   75.6%
1      521    24.4%


time
Real number (ℝ)

High correlation 

Distinct: 713
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 879.09818
Minimum: 14
Maximum: 1231
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 14
5-th percentile: 256.7
Q1: 727
median: 997
Q3: 1091
95-th percentile: 1160
Maximum: 1231
Range: 1217
Interquartile range (IQR): 364

Descriptive statistics

Standard deviation: 292.27432
Coefficient of variation (CV): 0.33247063
Kurtosis: 0.026238489
Mean: 879.09818
Median Absolute Deviation (MAD): 119
Skewness: -1.1217076
Sum: 1880391
Variance: 85424.28
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1087                34     1.6%
1088                24     1.1%
1154                18     0.8%
1091                16     0.7%
1090                16     0.7%
1089                16     0.7%
993                 15     0.7%
1132                14     0.7%
1097                13     0.6%
1117                13     0.6%
Other values (703)  1960   91.6%

Minimum 10 values

Value  Count  Frequency (%)
14     1      < 0.1%
33     1      < 0.1%
45     1      < 0.1%
50     1      < 0.1%
54     1      < 0.1%
55     1      < 0.1%
62     1      < 0.1%
69     1      < 0.1%
96     1      < 0.1%
105    1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
1231   3      0.1%
1230   1      < 0.1%
1224   4      0.2%
1223   1      < 0.1%
1217   1      < 0.1%
1214   2      0.1%
1211   1      < 0.1%
1209   2      0.1%
1206   1      < 0.1%
1203   3      0.1%

trt
Categorical

High correlation 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 3
4th row: 3
5th row: 0

Common Values

Value  Count  Frequency (%)
3      561    26.2%
0      532    24.9%
2      524    24.5%
1      522    24.4%


age
Real number (ℝ)

Distinct: 59
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.248247
Minimum: 12
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 12
5-th percentile: 23
Q1: 29
median: 34
Q3: 40
95-th percentile: 50
Maximum: 70
Range: 58
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.7090262
Coefficient of variation (CV): 0.24707686
Kurtosis: 0.97768528
Mean: 35.248247
Median Absolute Deviation (MAD): 5
Skewness: 0.64247151
Sum: 75396
Variance: 75.847138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
33                 111    5.2%
32                 109    5.1%
30                 109    5.1%
29                 106    5.0%
31                 105    4.9%
37                 100    4.7%
27                 98     4.6%
35                 98     4.6%
36                 94     4.4%
39                 91     4.3%
Other values (49)  1118   52.3%

Minimum 10 values

Value  Count  Frequency (%)
12     3      0.1%
13     3      0.1%
14     6      0.3%
15     3      0.1%
16     7      0.3%
17     4      0.2%
18     7      0.3%
19     7      0.3%
20     17     0.8%
21     19     0.9%

Maximum 10 values

Value  Count  Frequency (%)
70     2      0.1%
69     1      < 0.1%
68     2      0.1%
67     2      0.1%
66     1      < 0.1%
65     3      0.1%
64     2      0.1%
63     6      0.3%
62     5      0.2%
61     2      0.1%

wtkg
Real number (ℝ)

Distinct: 667
Distinct (%): 31.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.125311
Minimum: 31
Maximum: 159.93936
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 31
5-th percentile: 55.48392
Q1: 66.6792
median: 74.3904
Q3: 82.5552
95-th percentile: 97.888192
Maximum: 159.93936
Range: 128.93936
Interquartile range (IQR): 15.876

Descriptive statistics

Standard deviation: 13.263164
Coefficient of variation (CV): 0.17654721
Kurtosis: 2.2290396
Mean: 75.125311
Median Absolute Deviation (MAD): 7.9096
Skewness: 0.70648606
Sum: 160693.04
Variance: 175.91152
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
70.7616             26     1.2%
77.112              24     1.1%
78.0192             23     1.1%
73.0296             22     1.0%
76.2048             20     0.9%
72.576              20     0.9%
69.8544             19     0.9%
69.4008             18     0.8%
86.184              17     0.8%
68.04               17     0.8%
Other values (657)  1933   90.4%

Minimum 10 values

Value     Count  Frequency (%)
31        1      < 0.1%
32.6592   1      < 0.1%
36.78696  1      < 0.1%
41        2      0.1%
41.0508   1      < 0.1%
41.2776   1      < 0.1%
41.4      1      < 0.1%
42.4116   1      < 0.1%
43.00128  1      < 0.1%
43.8      1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
159.93936  1      < 0.1%
149        1      < 0.1%
135.1728   1      < 0.1%
130.6368   1      < 0.1%
129        1      < 0.1%
127.7      1      < 0.1%
127.008    1      < 0.1%
125.6472   1      < 0.1%
123.3792   1      < 0.1%
122.472    1      < 0.1%

hemo
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1959   91.6%
1      180    8.4%


homo
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      1414   66.1%
0      725    33.9%


drugs
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1858   86.9%
1      281    13.1%


karnof
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.5904628
Min length: 2

Characters and Unicode

Total characters: 5541
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 100
2nd row: 90
3rd row: 90
4th row: 100
5th row: 100

Common Values

Value  Count  Frequency (%)
100    1263   59.0%
90     787    36.8%
80     80     3.7%
70     9      0.4%


Most occurring characters

Value  Count  Frequency (%)
0      3402   61.4%
1      1263   22.8%
9      787    14.2%
8      80     1.4%
7      9      0.2%
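
The character table for karnof follows directly from the value counts, because the profiler treated the scores as text. A short sketch, assuming each value is counted character by character as the strings "100", "90", "80" and "70"; the names `value_counts` and `char_counts` are illustrative.

```python
from collections import Counter

# karnof value counts from the Common Values table, keyed by their text form.
value_counts = {"100": 1263, "90": 787, "80": 80, "70": 9}

# Each occurrence of a value contributes every one of its characters.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())

for ch, count in char_counts.most_common():
    print(f"{ch}  {count}  {100 * count / total_chars:.1f}%")
print(f"Total characters: {total_chars}")
print(f"Mean length: {mean_length:.7f}")
```

This also explains the "Total characters" and "Mean length" figures for karnof: they describe the text representation of the scores, not the scores themselves.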


oprior
Categorical

Imbalance 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      2092   97.8%
1      47     2.2%


z30
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      1177   55.0%
0      962    45.0%


zprior
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      2139   100.0%


preanti
Real number (ℝ)

High correlation  Zeros 

Distinct: 813
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 379.17578
Minimum: 0
Maximum: 2851
Zeros: 873
Zeros (%): 40.8%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 142
Q3: 739.5
95-th percentile: 1303.3
Maximum: 2851
Range: 2851
Interquartile range (IQR): 739.5

Descriptive statistics

Standard deviation: 468.65753
Coefficient of variation (CV): 1.2359901
Kurtosis: 0.93929023
Mean: 379.17578
Median Absolute Deviation (MAD): 142
Skewness: 1.1913747
Sum: 811057
Variance: 219639.88
Monotonicity: Not monotonic
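
The Zeros alert explains the shape of the preanti quantiles. As a rough sketch: the variable has no negative values, so once more than 25% of the observations are zero, the 5th percentile and Q1 must both be 0 and the IQR collapses to Q3. The names below are illustrative.

```python
# Reported values for preanti.
n_observations = 2139
n_zeros = 873
q3 = 739.5

zero_share = n_zeros / n_observations

# With non-negative data, any quantile level below the zero share equals 0.
# None marks the case where Q1 cannot be inferred from the zero count alone.
q1 = 0 if zero_share > 0.25 else None
iqr = q3 - q1 if q1 is not None else None

print(f"Zeros (%): {100 * zero_share:.1f}%")
print(f"Q1: {q1}")
print(f"IQR: {iqr}")
```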
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   873    40.8%
768                 6      0.3%
917                 6      0.3%
7                   5      0.2%
175                 5      0.2%
213                 5      0.2%
238                 5      0.2%
925                 5      0.2%
807                 4      0.2%
959                 4      0.2%
Other values (803)  1221   57.1%

Minimum 10 values

Value  Count  Frequency (%)
0      873    40.8%
2      2      0.1%
4      1      < 0.1%
5      1      < 0.1%
6      4      0.2%
7      5      0.2%
10     1      < 0.1%
12     1      < 0.1%
13     1      < 0.1%
14     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
2851   1      < 0.1%
2500   1      < 0.1%
2489   1      < 0.1%
2342   1      < 0.1%
2283   1      < 0.1%
2078   1      < 0.1%
2071   1      < 0.1%
1983   1      < 0.1%
1938   1      < 0.1%
1902   1      < 0.1%

race
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1522   71.2%
1      617    28.8%


gender
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      1771   82.8%
0      368    17.2%


str2
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      1253   58.6%
0      886    41.4%


strat
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value  Count  Frequency (%)
1      886    41.4%
3      843    39.4%
2      410    19.2%


symptom
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1769   82.7%
1      370    17.3%


treat
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
1      1607   75.1%
0      532    24.9%


offtrt
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.8 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2139
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1363   63.7%
1      776    36.3%


cd40
Real number (ℝ)

High correlation 

Distinct: 484
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 350.50117
Minimum: 0
Maximum: 1199
Zeros: 3
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 182
Q1: 263.5
median: 340
Q3: 423
95-th percentile: 549
Maximum: 1199
Range: 1199
Interquartile range (IQR): 159.5

Descriptive statistics

Standard deviation: 118.57386
Coefficient of variation (CV): 0.33829805
Kurtosis: 1.8054107
Mean: 350.50117
Median Absolute Deviation (MAD): 80
Skewness: 0.75786782
Sum: 749722
Variance: 14059.761
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
300                 27     1.3%
280                 24     1.1%
420                 23     1.1%
410                 19     0.9%
380                 19     0.9%
320                 19     0.9%
400                 18     0.8%
230                 18     0.8%
350                 17     0.8%
440                 16     0.7%
Other values (474)  1939   90.6%

Value  Count  Frequency (%)
0      3      0.1%
70     1      < 0.1%
84     1      < 0.1%
99     1      < 0.1%
103    1      < 0.1%
110    1      < 0.1%
112    1      < 0.1%
120    1      < 0.1%
122    1      < 0.1%
123    1      < 0.1%

Value  Count  Frequency (%)
1199   1      < 0.1%
918    1      < 0.1%
911    1      < 0.1%
834    1      < 0.1%
775    1      < 0.1%
771    1      < 0.1%
770    1      < 0.1%
760    1      < 0.1%
743    1      < 0.1%
740    1      < 0.1%
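
Several of the cd40 summary statistics are related by simple identities, which gives a quick way to sanity-check the reported numbers. The sketch below is only an illustration: the values are copied from the report, the observation count of 2139 comes from the overview, and the relative tolerance is an assumption chosen to absorb the rounding of the printed figures.

```python
import math

# Summary values for cd40 copied from the report above.
n = 2139                 # number of observations (from the overview)
total = 749722           # Sum
mean = 350.50117
std = 118.57386
variance = 14059.761
cv = 0.33829805
q1, q3 = 263.5, 423
minimum, maximum = 0, 1199

# Each derived quantity should match the reported one up to rounding.
checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": q3 - q1 == 159.5,
    "range = max - min": maximum - minimum == 1199,
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'mismatch'}")
```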

cd420
Real number (ℝ)

High correlation 

Distinct: 570
Distinct (%): 26.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 371.30715
Minimum: 49
Maximum: 1119
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 49
5-th percentile: 163.9
Q1: 269
median: 353
Q3: 460
95-th percentile: 618.1
Maximum: 1119
Range: 1070
Interquartile range (IQR): 191

Descriptive statistics

Standard deviation: 144.63491
Coefficient of variation (CV): 0.38952901
Kurtosis: 1.0710554
Mean: 371.30715
Median Absolute Deviation (MAD): 95
Skewness: 0.73034654
Sum: 794226
Variance: 20919.257
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
300                 22     1.0%
270                 17     0.8%
390                 17     0.8%
380                 16     0.7%
400                 15     0.7%
310                 15     0.7%
290                 15     0.7%
250                 15     0.7%
360                 14     0.7%
320                 14     0.7%
Other values (560)  1979   92.5%

Value  Count  Frequency (%)
49     1      < 0.1%
50     1      < 0.1%
52     1      < 0.1%
53     1      < 0.1%
74     1      < 0.1%
80     3      0.1%
81     3      0.1%
83     1      < 0.1%
87     1      < 0.1%
88     2      0.1%

Value  Count  Frequency (%)
1119   1      < 0.1%
1100   1      < 0.1%
1040   1      < 0.1%
980    1      < 0.1%
955    1      < 0.1%
930    1      < 0.1%
909    1      < 0.1%
877    1      < 0.1%
865    1      < 0.1%
858    1      < 0.1%

cd80
Real number (ℝ)

High correlation 

Distinct: 1090
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 986.6274
Minimum: 40
Maximum: 5011
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 40
5-th percentile: 412.7
Q1: 654
median: 893
Q3: 1207
95-th percentile: 1849.4
Maximum: 5011
Range: 4971
Interquartile range (IQR): 553

Descriptive statistics

Standard deviation: 480.19775
Coefficient of variation (CV): 0.48670628
Kurtosis: 6.1789811
Mean: 986.6274
Median Absolute Deviation (MAD): 269
Skewness: 1.7337359
Sum: 2110396
Variance: 230589.88
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
620                  10     0.5%
870                  10     0.5%
1040                 10     0.5%
880                  9      0.4%
990                  9      0.4%
950                  9      0.4%
1000                 9      0.4%
680                  8      0.4%
850                  8      0.4%
520                  8      0.4%
Other values (1080)  2049   95.8%

Value  Count  Frequency (%)
40     1      < 0.1%
105    1      < 0.1%
116    1      < 0.1%
137    1      < 0.1%
176    1      < 0.1%
177    1      < 0.1%
207    1      < 0.1%
218    1      < 0.1%
221    1      < 0.1%
225    2      0.1%

Value  Count  Frequency (%)
5011   1      < 0.1%
4255   1      < 0.1%
3827   1      < 0.1%
3780   1      < 0.1%
3389   1      < 0.1%
3348   1      < 0.1%
3192   1      < 0.1%
3190   1      < 0.1%
3101   1      < 0.1%
3046   1      < 0.1%

cd820
Real number (ℝ)

High correlation 

Distinct: 1050
Distinct (%): 49.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 935.3698
Minimum: 124
Maximum: 6035
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.8 KiB

Quantile statistics

Minimum: 124
5-th percentile: 399.9
Q1: 631.5
median: 865
Q3: 1146.5
95-th percentile: 1742.8
Maximum: 6035
Range: 5911
Interquartile range (IQR): 515

Descriptive statistics

Standard deviation: 444.97605
Coefficient of variation (CV): 0.47572206
Kurtosis: 11.77081
Mean: 935.3698
Median Absolute Deviation (MAD): 253
Skewness: 2.0914266
Sum: 2000756
Variance: 198003.69
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
920                  11     0.5%
730                  11     0.5%
570                  10     0.5%
710                  10     0.5%
740                  9      0.4%
700                  9      0.4%
530                  9      0.4%
590                  9      0.4%
630                  8      0.4%
560                  8      0.4%
Other values (1040)  2045   95.6%

Value  Count  Frequency (%)
124    1      < 0.1%
131    1      < 0.1%
140    1      < 0.1%
150    1      < 0.1%
173    1      < 0.1%
195    1      < 0.1%
200    1      < 0.1%
213    1      < 0.1%
214    1      < 0.1%
218    1      < 0.1%

Value  Count  Frequency (%)
6035   1      < 0.1%
4113   1      < 0.1%
3552   1      < 0.1%
3407   1      < 0.1%
3130   1      < 0.1%
3044   1      < 0.1%
2856   1      < 0.1%
2807   1      < 0.1%
2801   1      < 0.1%
2798   1      < 0.1%
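
The skewness (2.09) and kurtosis (11.77) reported for cd820 suggest a long right tail, and the largest values listed above sit far from the bulk of the data. One common way to make this concrete is Tukey's 1.5 × IQR rule. That rule is not part of the report and is used here only as an illustration; the quartiles and extremes are copied from the statistics above.

```python
# Quartiles and extremes for cd820 copied from the report above.
q1, q3 = 631.5, 1146.5
minimum, maximum = 124, 6035

iqr = q3 - q1                      # matches the reported IQR of 515
lower_fence = q1 - 1.5 * iqr       # values below this would be flagged
upper_fence = q3 + 1.5 * iqr       # values above this would be flagged

print(f"IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}]")
print(f"Maximum beyond upper fence: {maximum > upper_fence}")
print(f"Minimum beyond lower fence: {minimum < lower_fence}")
```

Under this rule all ten of the largest cd820 values listed above (2798 and higher) lie beyond the upper fence, while the minimum of 124 is well inside the lower one.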

Interactions

[Figure: interaction plots, 81 panels]

Correlations

             age    cd40   cd420    cd80   cd820     cid   drugs  gender    hemo    homo  karnof  offtrt  oprior  pidnum preanti    race    str2   strat symptom    time   treat     trt    wtkg     z30
age        1.000  -0.040  -0.050   0.054   0.047   0.069   0.155   0.058   0.423   0.229   0.065   0.007   0.071  -0.152   0.127   0.096   0.116   0.107   0.091   0.045   0.023   0.000   0.152   0.099
cd40      -0.040   1.000   0.620   0.228   0.078   0.189   0.000   0.000   0.000   0.000   0.050   0.147   0.029   0.032  -0.110   0.000   0.121   0.081   0.123   0.191   0.000   0.000   0.047   0.115
cd420     -0.050   0.620   1.000   0.059   0.218   0.390   0.000   0.000   0.043   0.000   0.049   0.201   0.113  -0.013  -0.204   0.036   0.229   0.157   0.128   0.316   0.148   0.096   0.032   0.211
cd80       0.054   0.228   0.059   1.000   0.742   0.050   0.007   0.102   0.085   0.120   0.000   0.050   0.000  -0.073   0.038   0.022   0.000   0.048   0.028   0.023   0.000   0.000   0.094   0.005
cd820      0.047   0.078   0.218   0.742   1.000   0.031   0.000   0.101   0.089   0.104   0.000   0.000   0.000  -0.066   0.028   0.055   0.015   0.027   0.050   0.042   0.000   0.040   0.084   0.000
cid        0.069   0.189   0.390   0.050   0.031   1.000   0.043   0.038   0.000   0.049   0.096   0.089   0.031   0.000   0.139   0.050   0.120   0.128   0.126   0.633   0.126   0.127   0.019   0.123
drugs      0.155   0.000   0.000   0.007   0.000   0.043   1.000   0.138   0.088   0.204   0.086   0.094   0.013   0.132   0.000   0.078   0.000   0.004   0.013   0.000   0.000   0.000   0.055   0.000
gender     0.058   0.000   0.000   0.102   0.101   0.038   0.138   1.000   0.112   0.606   0.017   0.000   0.032   0.155   0.022   0.290   0.021   0.081   0.059   0.104   0.007   0.000   0.328   0.027
hemo       0.423   0.000   0.043   0.085   0.089   0.000   0.088   0.112   1.000   0.389   0.064   0.000   0.020   0.818   0.140   0.065   0.121   0.139   0.071   0.061   0.000   0.000   0.097   0.108
homo       0.229   0.000   0.000   0.120   0.104   0.049   0.204   0.606   0.389   1.000   0.036   0.038   0.000   0.381   0.072   0.305   0.028   0.034   0.115   0.054   0.009   0.000   0.255   0.044
karnof     0.065   0.050   0.049   0.000   0.000   0.096   0.086   0.017   0.064   0.036   1.000   0.102   0.051   0.103   0.146   0.000   0.084   0.073   0.102   0.092   0.000   0.000   0.000   0.077
offtrt     0.007   0.147   0.201   0.050   0.000   0.089   0.094   0.000   0.000   0.038   0.102   1.000   0.000   0.028   0.068   0.000   0.014   0.065   0.067   0.488   0.046   0.059   0.033   0.018
oprior     0.071   0.029   0.113   0.000   0.000   0.031   0.013   0.032   0.020   0.000   0.051   0.000   1.000   0.061   0.128   0.000   0.121   0.131   0.000   0.023   0.018   0.011   0.000   0.027
pidnum    -0.152   0.032  -0.013  -0.073  -0.066   0.000   0.132   0.155   0.818   0.381   0.103   0.028   0.061   1.000  -0.055   0.153   0.197   0.146   0.098  -0.106   0.000   0.000  -0.068   0.179
preanti    0.127  -0.110  -0.204   0.038   0.028   0.139   0.000   0.022   0.140   0.072   0.146   0.068   0.128  -0.055   1.000   0.129   0.733   0.682   0.038   0.087   0.000   0.000  -0.076   0.690
race       0.096   0.000   0.036   0.022   0.055   0.050   0.078   0.290   0.065   0.305   0.000   0.000   0.000   0.153   0.129   1.000   0.076   0.110   0.074   0.092   0.000   0.000   0.129   0.069
str2       0.116   0.121   0.229   0.000   0.015   0.120   0.000   0.021   0.121   0.028   0.084   0.014   0.121   0.197   0.733   0.076   1.000   1.000   0.020   0.146   0.000   0.000   0.102   0.902
strat      0.107   0.081   0.157   0.048   0.027   0.128   0.004   0.081   0.139   0.034   0.073   0.065   0.131   0.146   0.682   0.110   1.000   1.000   0.034   0.105   0.000   0.000   0.074   0.905
symptom    0.091   0.123   0.128   0.028   0.050   0.126   0.013   0.059   0.071   0.115   0.102   0.067   0.000   0.098   0.038   0.074   0.020   0.034   1.000   0.152   0.000   0.000   0.000   0.000
time       0.045   0.191   0.316   0.023   0.042   0.633   0.000   0.104   0.061   0.054   0.092   0.488   0.023  -0.106   0.087   0.092   0.146   0.105   0.152   1.000   0.161   0.088   0.020   0.147
treat      0.023   0.000   0.148   0.000   0.000   0.126   0.000   0.007   0.000   0.009   0.000   0.046   0.018   0.000   0.000   0.000   0.000   0.000   0.000   0.161   1.000   1.000   0.029   0.000
trt        0.000   0.000   0.096   0.000   0.040   0.127   0.000   0.000   0.000   0.000   0.000   0.059   0.011   0.000   0.000   0.000   0.000   0.000   0.000   0.088   1.000   1.000   0.000   0.000
wtkg       0.152   0.047   0.032   0.094   0.084   0.019   0.055   0.328   0.097   0.255   0.000   0.033   0.000  -0.068  -0.076   0.129   0.102   0.074   0.000   0.020   0.029   0.000   1.000   0.095
z30        0.099   0.115   0.211   0.005   0.000   0.123   0.000   0.027   0.108   0.044   0.077   0.018   0.027   0.179   0.690   0.069   0.902   0.905   0.000   0.147   0.000   0.000   0.095   1.000
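
The "High correlation" alerts in the overview can be related back to this matrix by filtering pairs on their absolute coefficient. The sketch below is a hedged illustration, not the report's own logic: it uses a handful of entries copied from the matrix, and the cutoff of 0.5 is an assumption. That value is consistent with the alerts (for example, cid/time at 0.633 is flagged while offtrt/time at 0.488 is not), but the report does not state it.

```python
# A few off-diagonal entries copied from the correlation matrix above.
# Only a subset is included to keep the sketch short.
correlations = {
    ("cd40", "cd420"): 0.620,
    ("cd80", "cd820"): 0.742,
    ("cid", "time"): 0.633,
    ("gender", "homo"): 0.606,
    ("hemo", "pidnum"): 0.818,
    ("preanti", "str2"): 0.733,
    ("offtrt", "time"): 0.488,
    ("age", "hemo"): 0.423,
}

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

# Keep pairs at or above the cutoff, strongest first.
high = sorted(
    (pair for pair, value in correlations.items() if abs(value) >= THRESHOLD),
    key=lambda pair: -abs(correlations[pair]),
)

for a, b in high:
    print(f"{a} ~ {b}: {correlations[(a, b)]:.3f}")
```

The matrix is symmetric, so each pair needs to be listed only once.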

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

      pidnum  cid  time  trt  age      wtkg  hemo  homo  drugs  karnof  oprior  z30  zprior  preanti  race  gender  str2  strat  symptom  treat  offtrt  cd40  cd420  cd80  cd820
   0   10056    0   948    2   48   89.8128     0     0      0     100       0    0       1        0     0       0     0      1        0      1       0   422    477   566    324
   1   10059    1  1002    3   61   49.4424     0     0      0      90       0    1       1      895     0       0     1      3        0      1       0   162    218   392    564
   2   10089    0   961    3   45   88.4520     0     1      1      90       0    1       1      707     0       1     1      3        0      1       1   326    274  2063   1893
   3   10093    0  1166    3   47   85.2768     0     1      0     100       0    1       1     1399     0       1     1      3        0      1       0   287    394  1590    966
   4   10124    0  1090    0   43   66.6792     0     1      0     100       0    1       1     1352     0       1     1      3        0      0       0   504    353   870    782
   5   10140    0  1181    1   46   88.9056     0     1      1     100       0    1       1     1181     0       1     1      3        0      1       0   235    339   860   1060
   6   10165    1   794    0   31   73.0296     0     1      0     100       0    1       1      930     0       1     1      3        0      0       0   244    225   708    699
   7   10190    0   957    0   41   66.2256     0     1      1     100       0    1       1     1329     0       1     1      3        0      0       0   401    366   889    720
   8   10198    1   198    3   40   82.5552     0     1      0      90       0    1       1     1074     0       1     1      3        1      1       1   214    107   652    131
   9   10229    1   188    0   35   78.0192     0     1      0     100       0    1       1      964     0       1     1      3        0      0       1   221    132   221    759

      pidnum  cid  time  trt  age      wtkg  hemo  homo  drugs  karnof  oprior  z30  zprior  preanti  race  gender  str2  strat  symptom  treat  offtrt  cd40  cd420  cd80  cd820
2129  980042    1   588    0   16   63.0000     1     0      0     100       0    1       1      753     0       1     1      3        0      0       0   299    214   546    471
2130  980045    0  1087    2   25   78.0000     1     0      0     100       0    1       1      905     0       1     1      3        0      1       0   468    594   636    554
2131  980046    0   948    1   20   72.4000     1     0      0     100       0    0       1        0     0       0     0      1        0      1       1   483    641  1728   1504
2132  990018    0   413    3   27   80.2872     1     0      0      70       0    1       1     2078     0       1     1      3        0      1       1   321    222   910   1009
2133  990019    1  1041    2   39   64.8648     1     0      0      90       0    1       1     1042     0       1     1      3        0      1       1   378    401   504    367
2134  990021    0  1091    3   21   53.2980     1     0      0     100       0    1       1      842     0       1     1      3        0      1       1   152    109   561    720
2135  990026    0   395    0   17  102.9672     1     0      0     100       0    1       1      417     1       1     1      3        0      0       1   373    218  1759   1030
2136  990030    0  1104    2   53   69.8544     1     1      0      90       0    1       1      753     1       1     1      3        0      1       0   419    364  1391   1041
2137  990071    1   465    0   14   60.0000     1     0      0     100       0    0       1        0     0       1     0      1        0      0       0   166    169   999   1838
2138  990077    0  1045    3   45   77.3000     1     0      0     100       0    0       1        0     0       1     0      1        0      1       0   911    930   885    526